Non-uniform slant estimation and correction for Farsi/Arabic handwritten words

Internal identifier: 000A74 (Main/Exploration); previous: 000A73; next: 000A75

Authors: Majid Ziaratban [Iran]; Karim Faez [Iran]

Source:

International journal on document analysis and recognition (Print) [ISSN 1433-2833]; 2009.

RBID : Pascal:10-0182400

French descriptors

Pascal (Inist)
- Reconnaissance optique caractère, Caractère manuscrit, Reconnaissance caractère, Langage spécification, Arabe, Correction erreur, Erreur estimation, Estimation erreur.

English descriptors

KwdEn :
- Arabic, Character recognition, Error correction, Error estimation, Estimation error, Manuscript character, Optical character recognition, Specification language.

Abstract

Slant correction is an important part of the normalization task in OCR applications. Due to some special specifications of Farsi and Arabic manuscripts, conventional deslanting methods proposed for other languages do not work properly. In this paper, a fast method is first introduced to estimate the overall tilt of a handwritten word based on directional filters. After overall deslanting, a novel non-uniform slant estimation algorithm computes the remaining slant of each near-vertical stroke of the word, separately. Each candidate stroke is traced and its slant is calculated. A non-uniform slant correction algorithm is also proposed to reduce the remaining slants of each candidate stroke keeping the distortions of other strokes of the word at a minimum level. Thanks to the special characteristics of Farsi/Arabic scripts, slants are estimated in a specific strip of the written words. A comparison between our approach and three other prevalent methods is drawn. Experiments show that the proposed overall slant estimation method not only represents the least estimation error, but is also the fastest algorithm. The best results are achieved using the proposed overall and non-uniform deslanting methods. It is concluded that successful results can be achieved by considering the special specifications of these two languages.
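
To make the idea of overall (uniform) deslanting concrete, here is a minimal sketch in Python. It is not the method of the paper: the abstract gives no details of the directional filters, so this sketch swaps in a generic projection-profile search, which tries a range of shear angles and keeps the one that makes the column projection most peaked. The function names `shear`, `estimate_correction_angle` and `deslant`, the angle range, and the scoring rule are all assumptions chosen for illustration. The non-uniform, per-stroke step described in the abstract is not covered.

```python
import numpy as np

def shear(img, angle_deg):
    """Shear a binary word image horizontally (illustrative helper).

    The bottom row stays in place; each row above it is shifted in
    proportion to its height, by tan(angle_deg) pixels per row.
    """
    h, w = img.shape
    t = np.tan(np.radians(angle_deg))
    pad = int(np.ceil(abs(t) * (h - 1)))  # room for the largest shift
    out = np.zeros((h, w + 2 * pad), dtype=img.dtype)
    for y in range(h):
        dx = int(round(t * (h - 1 - y)))
        out[y, pad + dx : pad + dx + w] = img[y]
    return out

def estimate_correction_angle(img, angles=range(-45, 46)):
    """Return the shear angle (degrees) that best straightens the strokes.

    Assumption: upright strokes concentrate ink in few columns, so the
    sum of squared column counts is largest when the word is deslanted.
    """
    def score(a):
        cols = shear(img, a).sum(axis=0).astype(float)
        return float((cols ** 2).sum())
    return max(angles, key=score)

def deslant(img):
    """Estimate the correcting shear and apply it to the whole word."""
    a = estimate_correction_angle(img)
    return shear(img, a), a

# Usage: a single stroke leaning 45 degrees to the right.
img = np.zeros((20, 30), dtype=np.uint8)
for y in range(20):
    img[y, 5 + (19 - y)] = 1
fixed, angle = deslant(img)
```

Because one angle is applied to the whole word, strokes whose slant differs from the dominant one remain tilted after this step; that residual slant is what the non-uniform stage described in the abstract is meant to address.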

Affiliations:

Iran

Links to previous steps (curation, corpus, ...)

to stream PascalFrancis, to step Corpus: 000191
to stream PascalFrancis, to step Curation: 000586
to stream PascalFrancis, to step Checkpoint: 000189
to stream Main, to step Merge: 000A83
to stream Main, to step Curation: 000A74

The document in XML format

<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en" level="a">Non-uniform slant estimation and correction for Farsi/Arabic handwritten words</title>
<author><name sortKey="Ziaratban, Majid" sort="Ziaratban, Majid" uniqKey="Ziaratban M" first="Majid" last="Ziaratban">Majid Ziaratban</name>
<affiliation wicri:level="1"><inist:fA14 i1="01"><s1>Electrical Engineering Department, Amirkabir University of Technology, 424 Hafez Ave</s1>
<s2>15916-34311 Tehran</s2>
<s3>IRN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
<country>Iran</country>
<wicri:noRegion>15916-34311 Tehran</wicri:noRegion>
</affiliation>
</author>
<author><name sortKey="Faez, Karim" sort="Faez, Karim" uniqKey="Faez K" first="Karim" last="Faez">Karim Faez</name>
<affiliation wicri:level="1"><inist:fA14 i1="01"><s1>Electrical Engineering Department, Amirkabir University of Technology, 424 Hafez Ave</s1>
<s2>15916-34311 Tehran</s2>
<s3>IRN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
<country>Iran</country>
<wicri:noRegion>15916-34311 Tehran</wicri:noRegion>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">INIST</idno>
<idno type="inist">10-0182400</idno>
<date when="2009">2009</date>
<idno type="stanalyst">PASCAL 10-0182400 INIST</idno>
<idno type="RBID">Pascal:10-0182400</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000191</idno>
<idno type="wicri:Area/PascalFrancis/Curation">000586</idno>
<idno type="wicri:Area/PascalFrancis/Checkpoint">000189</idno>
<idno type="wicri:doubleKey">1433-2833:2009:Ziaratban M:non:uniform:slant</idno>
<idno type="wicri:Area/Main/Merge">000A83</idno>
<idno type="wicri:Area/Main/Curation">000A74</idno>
<idno type="wicri:Area/Main/Exploration">000A74</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en" level="a">Non-uniform slant estimation and correction for Farsi/Arabic handwritten words</title>
<author><name sortKey="Ziaratban, Majid" sort="Ziaratban, Majid" uniqKey="Ziaratban M" first="Majid" last="Ziaratban">Majid Ziaratban</name>
<affiliation wicri:level="1"><inist:fA14 i1="01"><s1>Electrical Engineering Department, Amirkabir University of Technology, 424 Hafez Ave</s1>
<s2>15916-34311 Tehran</s2>
<s3>IRN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
<country>Iran</country>
<wicri:noRegion>15916-34311 Tehran</wicri:noRegion>
</affiliation>
</author>
<author><name sortKey="Faez, Karim" sort="Faez, Karim" uniqKey="Faez K" first="Karim" last="Faez">Karim Faez</name>
<affiliation wicri:level="1"><inist:fA14 i1="01"><s1>Electrical Engineering Department, Amirkabir University of Technology, 424 Hafez Ave</s1>
<s2>15916-34311 Tehran</s2>
<s3>IRN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
<country>Iran</country>
<wicri:noRegion>15916-34311 Tehran</wicri:noRegion>
</affiliation>
</author>
</analytic>
<series><title level="j" type="main">International journal on document analysis and recognition : (Print)</title>
<title level="j" type="abbreviated">Int. j. doc. anal. recognit. : (Print)</title>
<idno type="ISSN">1433-2833</idno>
<imprint><date when="2009">2009</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt><title level="j" type="main">International journal on document analysis and recognition : (Print)</title>
<title level="j" type="abbreviated">Int. j. doc. anal. recognit. : (Print)</title>
<idno type="ISSN">1433-2833</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en"><term>Arabic</term>
<term>Character recognition</term>
<term>Error correction</term>
<term>Error estimation</term>
<term>Estimation error</term>
<term>Manuscript character</term>
<term>Optical character recognition</term>
<term>Specification language</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr"><term>Reconnaissance optique caractère</term>
<term>Caractère manuscrit</term>
<term>Reconnaissance caractère</term>
<term>Langage spécification</term>
<term>Arabe</term>
<term>Correction erreur</term>
<term>Erreur estimation</term>
<term>Estimation erreur</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Slant correction is an important part of the normalization task in OCR applications. Due to some special specifications of Farsi and Arabic manuscripts, conventional deslanting methods proposed for other languages do not work properly. In this paper, a fast method is first introduced to estimate the overall tilt of a handwritten word based on directional filters. After overall deslanting, a novel non-uniform slant estimation algorithm computes the remaining slant of each near-vertical stroke of the word, separately. Each candidate stroke is traced and its slant is calculated. A non-uniform slant correction algorithm is also proposed to reduce the remaining slants of each candidate stroke keeping the distortions of other strokes of the word at a minimum level. Thanks to the special characteristics of Farsi/Arabic scripts, slants are estimated in a specific strip of the written words. A comparison between our approach and three other prevalent methods is drawn. Experiments show that the proposed overall slant estimation method not only represents the least estimation error, but is also the fastest algorithm. The best results are achieved using the proposed overall and non-uniform deslanting methods. It is concluded that successful results can be achieved by considering the special specifications of these two languages.</div>
</front>
</TEI>
<affiliations><list><country><li>Iran</li>
</country>
</list>
<tree><country name="Iran"><noRegion><name sortKey="Ziaratban, Majid" sort="Ziaratban, Majid" uniqKey="Ziaratban M" first="Majid" last="Ziaratban">Majid Ziaratban</name>
</noRegion>
<name sortKey="Faez, Karim" sort="Faez, Karim" uniqKey="Faez K" first="Karim" last="Faez">Karim Faez</name>
</country>
</tree>
</affiliations>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration

HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000A74 | SxmlIndent | more

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000A74 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     Pascal:10-0182400
   |texte=   Non-uniform slant estimation and correction for Farsi/Arabic handwritten words
}}

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024
